Systems, methods, and devices for fault resilient storage

ABSTRACT

A method of operating a storage device may include determining a fault condition of the storage device, selecting a fault resilient mode based on the fault condition of the storage device, and operating the storage device in the selected fault resilient mode. The selected fault resilient mode may include one of a power cycle mode, a reformat mode, a reduced capacity read-only mode, a reduced capacity mode, a reduced performance mode, a read-only mode, a partial read-only mode, a temporary read-only mode, a temporary partial read-only mode, or a vulnerable mode. The storage device may be configured to perform a namespace capacity management command received from the host. The namespace capacity management command may include a resize subcommand and/or a zero-size namespace subcommand. The storage device may report the selected fault resilient mode to a host.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation of U.S. patent application Ser. No. 17/232,144, filed Apr. 15, 2021, which claims priority to, and the benefit of, U.S. Provisional Patent Application Ser. No. 63/023,243, filed May 11, 2020, which is incorporated by reference; U.S. Provisional Patent Application Ser. No. 63/128,001, filed Dec. 18, 2020, which is incorporated by reference; U.S. Provisional Patent Application Ser. No. 63/051,158, filed Jul. 13, 2020, which is incorporated by reference; U.S. Provisional Patent Application Ser. No. 63/052,854, filed Jul. 16, 2020, which is incorporated by reference; and U.S. Provisional Patent Application Ser. No. 63/057,744, filed Jul. 28, 2020, which is incorporated by reference.

TECHNICAL FIELD

This disclosure relates generally to storage, and more specifically to systems, methods, and devices for fault resilient storage.

BACKGROUND

A storage device may encounter a fault condition that may affect the ability of the storage device to operate in a storage system.

The above information disclosed in this Background section is only for enhancement of understanding of the background of the invention and therefore it may contain information that does not constitute prior art.

SUMMARY

A method of operating a storage device may include determining a fault condition of the storage device, selecting a fault resilient mode based on the fault condition of the storage device, and operating the storage device in the selected fault resilient mode. The selected fault resilient mode may include a power cycle mode. The selected fault resilient mode may include a reformat mode. The selected fault resilient mode may include a reduced capacity read-only mode. The selected fault resilient mode may include a reduced capacity mode. The selected fault resilient mode may include a reduced performance mode. The selected fault resilient mode may include a read-only mode. The selected fault resilient mode may include a partial read-only mode. The selected fault resilient mode may include a temporary read-only mode. The selected fault resilient mode may include a temporary partial read-only mode. The selected fault resilient mode may include a vulnerable mode. The selected fault resilient mode may include a normal mode. The storage device may be configured to perform a command received from a host. The command may include a namespace capacity management command. The namespace capacity management command may include a resize subcommand. The namespace capacity management command may include a zero-size namespace subcommand.

A storage device may include a storage medium, and a storage controller, wherein the storage controller is configured to determine a fault condition of the storage device, select a fault resilient mode based on the fault condition of the storage device, and operate the storage device in the selected fault resilient mode. The selected fault resilient mode may include one of a power cycle mode, a reformat mode, a reduced capacity read-only mode, a reduced capacity mode, a reduced performance mode, a read-only mode, a partial read-only mode, a temporary read-only mode, a temporary partial read-only mode, or a vulnerable mode. The storage device may be configured to perform a namespace capacity management command received from a host.

A system may include a host, and at least one storage device coupled to the host, wherein the storage device is configured to determine a fault condition of the storage device, select a fault resilient mode based on the fault condition of the storage device, operate in the selected fault resilient mode, and report the selected fault resilient mode to the host.

A method of operating a storage array may include determining a first fault resilient operating mode of a first fault resilient storage device of the storage array, determining a second fault resilient operating mode of a second fault resilient storage device of the storage array, allocating one or more rescue spaces of one or more additional fault resilient storage devices of the storage array, mapping user data from the first fault resilient storage device to the one or more rescue spaces, and mapping user data from the second fault resilient storage device to the one or more rescue spaces. The method may further include reassigning at least one device identifier (ID) of the one or more additional fault resilient storage devices to a device ID of the first fault resilient storage device. The at least one device ID of the one or more additional fault resilient storage devices may be reassigned based on a current unaffected device ID and a current faulty device ID. The method may further include redirecting one or more inputs and/or outputs (IOs) from the first fault resilient storage device to the one or more additional fault resilient storage devices. The user data may include a strip of data. The strip of data may be redirected to a target storage device of the one or more additional fault resilient storage devices based on a stripe ID of the user data. Mapping user data from the first fault resilient storage device to the one or more rescue spaces may include maintaining a first mapping table. Mapping user data from the second fault resilient storage device to the one or more rescue spaces may include maintaining a second mapping table. The one or more rescue spaces may have a rescue space percentage ratio of a storage device capacity. The rescue space percentage ratio may be greater than or equal to a number of failed storage devices accommodated by the storage array, divided by the total number of storage devices in the storage array. The one or more rescue spaces may be allocated statically. The one or more rescue spaces may be allocated dynamically.
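
As a purely illustrative aid (not part of the claimed method), the rescue space percentage ratio described above can be sketched in Python; the function name and the validation are assumptions:

    def min_rescue_ratio(failed_supported: int, total_devices: int) -> float:
        # Minimum fraction of each device's capacity reserved as rescue space:
        # at least the number of failed devices accommodated divided by the
        # total number of devices in the array.
        if not 0 < failed_supported < total_devices:
            raise ValueError("failed device count must be between 1 and N - 1")
        return failed_supported / total_devices

    # Example: a 10-device array that accommodates 2 failed devices reserves
    # at least 2 / 10 = 20 percent of each device as rescue space.
    assert min_rescue_ratio(2, 10) == 0.2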

A system may include a storage array including a first fault resilient storage device, a second fault resilient storage device, one or more additional fault resilient storage devices, and a volume manager configured to: determine a first fault resilient operating mode of the first fault resilient storage device, determine a second fault resilient operating mode of the second fault resilient storage device, allocate one or more rescue spaces of one or more additional fault resilient storage devices of the storage array, map user data from the first fault resilient storage device to the one or more rescue spaces, and map user data from the second fault resilient storage device to the one or more rescue spaces. The volume manager may be further configured to reassign at least one device identifier (ID) of the one or more additional fault resilient storage devices to a device ID of the first fault resilient storage device. The volume manager may be further configured to redirect one or more inputs and/or outputs (IOs) from the first fault resilient storage device to the one or more additional fault resilient storage devices. The user data may include a strip of data, and the volume manager may be further configured to redirect the strip of data to a target storage device of the one or more additional fault resilient storage devices based on a stripe ID of the user data. The one or more rescue spaces may have a rescue space percentage ratio of a storage device capacity, and the rescue space percentage ratio may be based on a number of failed storage devices accommodated by the storage array, divided by a total number of storage devices in the storage array.

An apparatus may include a volume manager for a storage array, and the volume manager may include logic configured to: determine a first fault resilient operating mode of a first fault resilient storage device of the storage array, determine a second fault resilient operating mode of a second fault resilient storage device of the storage array, allocate one or more rescue spaces of one or more additional fault resilient storage devices of the storage array, map user data from the first fault resilient storage device to the one or more rescue spaces, and map user data from the second fault resilient storage device to the one or more rescue spaces. The user data may include a strip of data, and the strip of data may be redirected to a target storage device of the one or more additional fault resilient storage devices based on a stripe identifier (ID) of the user data. The one or more rescue spaces may have a rescue space percentage ratio of a storage device capacity, and the rescue space percentage ratio may be based on a number of failed storage devices accommodated by the storage array, divided by a total number of storage devices in the storage array.

A method of operating a storage array may include allocating a first rescue space of a first fault resilient storage device of the storage array, allocating a second rescue space of a second fault resilient storage device of the storage array, determining a fault resilient operating mode of a third fault resilient storage device of the storage array, and mapping user data from the third fault resilient storage device to the first rescue space and the second rescue space based on determining the fault resilient operating mode. A first block of the user data may be mapped to the first rescue space, and a second block of the user data may be mapped to the second rescue space. The user data may include a strip of data. A first portion of the strip of data may be mapped to the first rescue space, and the first portion of the strip of data may include a number of data blocks based on a size of the strip of data and a size of the data blocks. The number of data blocks may be further based on a total number of storage devices in the storage array. The method may further include reassigning at least one device identifier (ID) of the first fault resilient storage device to a device ID of the third fault resilient storage device. The method may further include redirecting one or more inputs and/or outputs (IOs) from the third fault resilient storage device to the first rescue space and the second rescue space. The first rescue space may have a capacity based on a capacity of the first fault resilient storage device and a total number of storage devices in the storage array. The first rescue space may have a capacity of strips based on a size of the first rescue space and a block size.
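
For illustration only, one plausible reading of the strip-distribution arithmetic above, sketched in Python; the even division of a strip among the remaining devices is an assumption:

    import math

    def blocks_per_rescue_space(strip_size: int, block_size: int,
                                total_devices: int) -> int:
        # Number of data blocks of a faulty device's strip placed in each
        # surviving device's rescue space, assuming the strip is divided
        # evenly among the remaining (total_devices - 1) devices.
        blocks_in_strip = math.ceil(strip_size / block_size)
        return math.ceil(blocks_in_strip / (total_devices - 1))

    # Example: a 128 KiB strip of 4 KiB blocks in a 4-device array maps
    # ceil(32 / 3) = 11 blocks to each of the other rescue spaces.
    assert blocks_per_rescue_space(128 * 1024, 4 * 1024, 4) == 11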

A system may include a storage array including a first fault resilient storage device, a second fault resilient storage device, a third fault resilient storage device, and a volume manager configured to allocate a first rescue space of the first fault resilient storage device, allocate a second rescue space of the second fault resilient storage device, determine a fault resilient operating mode of the third fault resilient storage device, and map user data from the third fault resilient storage device to the first rescue space and the second rescue space based on determining the fault resilient operating mode. The volume manager may be further configured to map a first block of the user data to the first rescue space, and map a second block of the user data to the second rescue space. The user data may include a strip of data, and the volume manager may be further configured to map a first portion of the strip of data to the first rescue space. The first portion of the strip of data may include a number of data blocks based on a size of the strip of data and a size of the data blocks. The number of data blocks may be further based on a total number of storage devices in the storage array.

A method of operating a storage array may include determining a first parameter of a first fault resilient storage device of the storage array, determining a second parameter of a second fault resilient storage device of the storage array, and determining a quality-of-service (QoS) of the storage array based on the first parameter and the second parameter. The method may further include adjusting the first parameter based on the QoS. The first parameter may be adjusted automatically based on monitoring the first parameter. The first parameter may be adjusted automatically based on monitoring the second parameter. The first parameter may be adjusted by configuring a component of the storage array. The first parameter may be adjusted by controlling the operation of a component of the storage array. The first parameter may include one of a number of storage devices in the storage array, a number of data blocks in a strip of user data for the first fault resilient storage device, a write method for redirecting data from the first fault resilient storage device to the second fault resilient storage device, a number of faulty storage devices supported by the storage array, or a storage capacity of the first fault resilient storage device.

A system may include a storage array including a first fault resilient storage device, a second fault resilient storage device, and a volume manager configured to determine a first parameter of the first fault resilient storage device, determine a second parameter of the second fault resilient storage device, and determine a quality-of-service (QoS) of the storage array based on the first parameter and the second parameter. The volume manager may be further configured to adjust the first parameter based on the QoS. The volume manager may be further configured to adjust the first parameter automatically based on monitoring the first parameter. The volume manager may be further configured to adjust the first parameter automatically based on monitoring the second parameter.

BRIEF DESCRIPTION OF THE DRAWINGS

The figures are not necessarily drawn to scale and elements of similar structures or functions may generally be represented by like reference numerals or portions thereof for illustrative purposes throughout the figures. The figures are only intended to facilitate the description of the various embodiments described herein. The figures do not describe every aspect of the teachings disclosed herein and do not limit the scope of the claims. To prevent the drawing from becoming obscured, not all of the components, connections, and the like may be shown, and not all of the components may have reference numbers. However, patterns of component configurations may be readily apparent from the drawings. The accompanying drawings, together with the specification, illustrate example embodiments in accordance with the disclosure, and, together with the description, serve to explain the principles of the present disclosure.

FIG. 1 illustrates an embodiment of a storage system in accordance with example embodiments of the disclosure.

FIG. 2A illustrates a table of some possible fault conditions that may be encountered by an embodiment of a fault resilient storage device in accordance with example embodiments of the disclosure.

FIG. 2B illustrates a table of some example embodiments of fault resilient modes and associated space types that may be implemented by a storage device in accordance with example embodiments of the disclosure.

FIG. 2C illustrates a table of some example embodiments of commands and subcommands that may be implemented by a storage device in accordance with example embodiments of the disclosure.

FIG. 2D illustrates a table of commands that a storage device in accordance with example embodiments of the disclosure may implement through an API.

FIG. 3A illustrates a flow chart of an embodiment of a method for operating in a fault resilient mode in accordance with example embodiments of the disclosure.

FIG. 3B illustrates a flow chart of an embodiment of a method of operating a storage device in accordance with example embodiments of the disclosure.

FIG. 4A illustrates a schematic data layout diagram of a RAID-0 system performing a write operation in accordance with example embodiments of the disclosure.

FIG. 4B illustrates a schematic data layout diagram of a RAID-0 system performing a read operation in accordance with example embodiments of the disclosure.

FIG. 4C illustrates a schematic data layout diagram of a RAID-0 system performing a remapping and write operation in accordance with example embodiments of the disclosure.

FIG. 5A illustrates a flowchart for a method for operating a RAID-0 system in accordance with example embodiments of the disclosure.

FIG. 5B illustrates a flow chart showing details of a method for operating a RAID-0 storage system in accordance with example embodiments of the disclosure.

FIG. 6 illustrates a schematic diagram of an embodiment of a RAID-0 system in accordance with example embodiments of the disclosure.

FIG. 7 illustrates a schematic diagram of an embodiment of a RAID-0 system that may implement rescue space management with data block writing in accordance with example embodiments of the disclosure.

FIG. 8 illustrates an example embodiment of a system for implementing quality-of-service (QoS) management in a storage system in accordance with example embodiments of the disclosure.

FIG. 9 illustrates an embodiment of a method of operating a storage array in accordance with example embodiments of the disclosure.

FIG. 10 illustrates an embodiment of another method of operating a storage array in accordance with example embodiments of the disclosure.

FIG. 11 illustrates an embodiment of a further method of operating a storage array in accordance with example embodiments of the disclosure.

DETAILED DESCRIPTION

Overview

Some of the principles in accordance with example embodiments of the disclosure relate to storage devices that may continue to operate in one or more fault resilient modes in case of a fault of the storage device. For example, a storage device may continue to operate in a limited manner that may enable a storage system to recover quickly and/or efficiently from the fault of the storage device.

In some embodiments, a storage device may implement any number of the following fault resilient (FR) modes; an illustrative encoding of these modes appears in the sketch following the list:

Some embodiments may implement a power cycle mode which may involve self-healing based on power cycling the storage device.

Some embodiments may implement a reformat mode which may involve self-healing based on formatting all or a portion of the storage device.

Some embodiments may implement a reduced capacity read-only mode in which a first portion of the storage space of the storage device may operate normally, and a second portion may operate as read-only storage space.

Some embodiments may implement a reduced capacity mode in which a first portion of the storage space of the storage device may operate normally, and a second portion may not be available for input and/or output (I/O) operations.

Some embodiments may implement a reduced performance mode in which one or more aspects of the performance of the storage device may be reduced.

Some embodiments may implement a read-only mode in which data may be read from, but not written to, the storage device.

Some embodiments may implement a partial read-only mode in which a first portion of the storage space of the storage device may operate as read-only storage space, and a second portion may not be available for normal input and/or output (I/O) operations.

Some embodiments may implement a temporary read-only mode in which data may be read from, but not written to, the storage space of the storage device, which may be temporarily valid, and may become invalid.

Some embodiments may implement a temporary partial read-only mode in which data may be read from, but not written to, a first portion of the storage space of the storage device, which may be temporarily valid, and may become invalid. A second portion may not be available for input and/or output (I/O) operations.

Some embodiments may implement a vulnerable mode in which the storage device may not be available for input and/or output (I/O) operations.

Some embodiments may implement a normal mode in which the storage device may operate normally.
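
For illustration, the modes listed above may be encoded as an enumeration. This is a minimal sketch in Python; the numbering (which anticipates the mode numbers used with FIG. 2B below) and the names are assumptions, not a normative API:

    from enum import Enum

    class FaultResilientMode(Enum):
        # Illustrative encoding of the fault resilient (FR) modes above.
        POWER_CYCLE = 1
        REFORMAT = 2
        REDUCED_CAPACITY_READ_ONLY = 3
        REDUCED_CAPACITY = 4
        REDUCED_PERFORMANCE = 5
        READ_ONLY = 6
        PARTIAL_READ_ONLY = 7
        TEMPORARY_READ_ONLY = 8
        TEMPORARY_PARTIAL_READ_ONLY = 9
        VULNERABLE = 10
        NORMAL = 11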

In some embodiments, a storage device may implement one or more commands which may be used by a host to determine and/or manage one or more features of the storage device. For example, in some embodiments, a storage device may implement a namespace capacity management command which may include a resize and/or zero-size subcommand.

The principles disclosed herein have independent utility and may be embodied individually, and not every embodiment may utilize every principle. However, the principles may also be embodied in various combinations, some of which may amplify the benefits of the individual principles in a synergistic manner.

Storage Systems

FIG. 1 illustrates an embodiment of a storage system in accordance with example embodiments of the disclosure. The embodiment illustrated in FIG. 1 may include a host 105 and one or more storage devices 110. Some or all of the one or more storage devices 110 may be connected directly to the host 105, and some or all of the one or more storage devices 110 may be connected to the host 105 through a volume manager 115 as shown in FIG. 1. Each storage device 110 may include a storage controller 120 (or "control circuit") and a storage media 125. In some embodiments, a storage device 110 may experience an internal fault condition, and the storage device may exhibit various fault resilient behaviors, as discussed in further detail below, to mitigate the system-level impact of the fault condition.

The one or more storage devices 110 may be implemented with any type of storage apparatus and associated storage media including solid state drives (SSDs), hard disk drives (HDDs), optical drives, drives based on any type of persistent memory such as cross-gridded nonvolatile memory with bulk resistance change, and/or the like, and/or any combination thereof. Data in each storage device may be arranged as blocks, key-value structures, and/or the like, and/or any combination thereof. Each storage device 110 may have any form factor such as 3.5 inch, 2.5 inch, 1.8 inch, M.2, MO-297, MO-300, Enterprise and Data Center SSD Form Factor (EDSFF), and/or the like, using any connector configuration such as Serial ATA (SATA), Small Computer System Interface (SCSI), Serial Attached SCSI (SAS), U.2, and/or the like, and using any storage interface and/or protocol such as Peripheral Component Interconnect (PCI), PCI express (PCIe), Nonvolatile Memory Express (NVMe), NVMe-over-Fabrics (NVMe-oF), Ethernet, InfiniBand, Fibre Channel, and/or the like. Some embodiments may be implemented entirely or partially with, and/or used in connection with, a server chassis, server rack, dataroom, datacenter, edge datacenter, mobile edge datacenter, and/or any combinations thereof, and/or the like.

Any or all of the host 105, volume manager 115, storage controller 120, and/or any other components disclosed herein may be implemented with hardware, software, or any combination thereof, including combinational logic, sequential logic, one or more timers, counters, registers, state machines, complex programmable logic devices (CPLDs), field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), complex instruction set computer (CISC) processors and/or reduced instruction set computer (RISC) processors, and/or the like, executing instructions stored in volatile memories such as dynamic random access memory (DRAM) and/or static random access memory (SRAM), nonvolatile memory such as flash memory and/or the like, as well as graphics processing units (GPUs), neural processing units (NPUs), and/or the like.

Although the inventive principles are not limited to any particular implementation details, for purposes of illustration, in some embodiments, each storage device 110 may be implemented as an SSD in which the storage media may be implemented, for example, with not AND (NAND) flash memory, and each storage controller 120 may implement any functionality associated with operating the SSD including a flash translation layer (FTL), a storage interface, and any functionality associated with implementing the fault resilient features disclosed herein. The smallest erasable unit in the storage device 110 may be referred to as a "block" and the smallest writeable unit in the storage device 110 may be referred to as a "page".

The storage media 125 may have a retention period (which may depend on the usage history of the storage media 125, and, as such, may vary within the storage media 125); data that has been stored longer than the retention period (i.e., data having an age exceeding the retention period) may become unreliable and may be said to have expired. Data may be stored in the storage media 125 using an error correcting code, which may be, e.g., a block code. When data is read from the storage media 125, a quantity of raw data, referred to as a code block, may be read from the storage media 125, and an attempt to decode it may be made. If the attempt fails, additional attempts (e.g., read retrials) may be made. With use, a portion, e.g., a block, of the storage media 125 may degrade to a point that the retention period becomes unacceptably short, and the block may be classified as a "bad block". To avoid allowing this circumstance to render the entire storage media 125 inoperable, reserve space, referred to as "bad block management reserve space", may be present (e.g., included in each flash memory die or in each flash memory plane), and the controller 120, or another controller internal to the flash memory die or to the flash memory plane, may begin to use a block in the reserve and cease to use the bad block.

The operations and/or components described with respect to the embodiment illustrated in FIG. 1, as well as all of the other embodiments described herein, are example operations and/or components. In some embodiments, some operations and/or components may be omitted and/or other operations and/or components may be included. Moreover, in some embodiments, the temporal and/or spatial order of the operations and/or components may be varied. Although some components may be illustrated as individual components, in some embodiments, some components shown separately may be integrated into single components, and/or some components shown as single components may be implemented with multiple components.

Fault Conditions

FIG. 2A illustrates a table of some possible fault conditions that may be encountered by an embodiment of a fault resilient storage device in accordance with example embodiments of the disclosure. Each fault condition (or "fault state") may be labeled with a case identifier ("Case ID") in the first column. The second column may indicate an operation status of the storage device in the fault state. The third column of the table may indicate, for each case, whether valid user data remains available. The fourth column of the table may indicate whether the storage device 110 may eventually be returned to full functionality, for example, by reformatting the storage media 125.

Case 1 may include a fault condition in which the storage device 110 may no longer be capable of performing read or write operations, and that may not be resolved by cycling power and/or reformatting the storage media. A state in which the storage device 110 behaves in this manner may have various sub-states, with, e.g., each sub-state corresponding to a different failure mechanism. Such a state, or fault condition (in which the storage device 110 is no longer capable of performing read or write operations, and that may not be resolved by cycling power or reformatting the storage media) may be caused, for example, by a portion of the controller's firmware becoming corrupted (in which case it may be possible for the controller to restart into a safe mode, in which the corrupted instructions may not be executed) or by a failure of a processing circuit in the storage device 110 (e.g., the failure of a processing circuit that manages interactions with the storage media but is not responsible for communications with the host 105). When a fault condition of this type occurs, the storage device 110 may respond to a read or write command from the host 105 with an error message.

Case 2 may include a fault condition (i) in which the storage device 110 may no longer be capable of performing read or write operations and (ii) from which recovery may be possible by cycling the power of the storage device 110, by reformatting the storage media (e.g., nonvolatile memory (NVM)), by re-loading firmware, and/or the like. Such a fault condition may be caused, for example, by a program execution error of the controller 120 of the storage device 110 (e.g., a pointer that is out of range as a result of a bit flip in random-access memory (RAM) of the controller 120, or an instruction that is incorrect, as a result of a bit flip). If the program execution error has not caused the controller 120 to write incorrect data to the storage media 125 (e.g., if the program execution error occurred since the most recent write to storage media by the controller), then power cycling the storage device may be sufficient to restore the storage device 110 to normal operation. If the program execution error has caused the controller 120 to write erroneous data to the storage media 125, then reformatting the storage media 125 and/or re-loading firmware may be sufficient to restore the storage device 110 to normal operation.

Case 3 may include a fault condition that may be mitigated by operating the storage device 110 in a read-only mode, and for which reformatting the storage media 125 may not restore full functionality. Examples of such faults may include (i) a temperature sensor failure, and (ii) a portion of the storage media 125 having transitioned to a read-only mode. In the case of the temperature sensor failure, the failure may be detected by determining that a temperature sensor reading is out of range (e.g., has exceeded a threshold temperature), and in such a case the risk of overheating of the storage device 110 may be reduced by avoiding write operations, which may dissipate more power than read operations. The transitioning to a read-only mode of a portion of the storage media 125 may occur, for example, for flash memory storage media 125, if a flash memory plane or die exhausts a bad block management reserve space used for run time bad block management. For example, the storage device 110 may, while attempting to perform a read operation, make an unsuccessful attempt to decode a data item, determine that the block storing the data is a bad block, and, upon moving the data from the bad block to the bad block management reserve space, determine that the remaining bad block management reserve space is less than a threshold size and therefore insufficient to ensure the reliability of the plane or die. The storage device 110 may then determine that bad block management is no longer being performed, and transition to a read-only mode. In some embodiments, a data item may refer to any quantity of data being processed in one operation, e.g., the data resulting from decoding a code block may be a data item.
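
A minimal sketch, assuming a toy model of a flash plane, of the Case 3 sequence described above (relocate data from a bad block, then transition to read-only when the reserve falls below a threshold); the threshold value and class structure are hypothetical:

    from dataclasses import dataclass, field

    RESERVE_THRESHOLD = 8  # assumed minimum reserve blocks per plane

    @dataclass
    class FlashPlane:
        reserve: list = field(default_factory=lambda: list(range(100, 108)))
        read_only: bool = False

        def on_decode_failure(self, bad_block: int) -> None:
            # Move the data from the bad block into a reserve block and
            # retire the bad block.
            spare = self.reserve.pop()
            print(f"relocated bad block {bad_block} to reserve block {spare}")
            if len(self.reserve) < RESERVE_THRESHOLD:
                # Remaining reserve is insufficient to ensure reliability,
                # so the plane transitions to a read-only mode (Case 3).
                self.read_only = True

    plane = FlashPlane()
    plane.on_decode_failure(42)
    assert plane.read_only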

Case 4 may include a fault condition that may be mitigated by operating the storage device 110 in a write-through mode. For example, if a power supply backup capacitor in the storage device 110 fails, the device may, in response to a write command received from the host, complete the write to the storage media 125 before sending a command completion to the host 105, so that if power fails before the write to the storage media 125 has been completed, the host is not incorrectly informed that the write was completed successfully. Operating in the write-through mode may result in a reduction of performance (e.g., in terms of throughput and/or latency).

Case 5 may include a fault condition that may be mitigated by operating the storage device 110 in a manner that reduces power dissipation. For example, in the case of a temperature sensor failure, the storage device 110 may operate in a read-only mode as mentioned above, or it may reduce the rate at which operations (e.g., write operations, which may dissipate more power than read operations) may be performed, to reduce power dissipation in the storage device 110. For example, the storage device 110 may perform a first write to the storage media, then wait, during an interval corresponding to the reduced performance (the waiting resulting in a decrease in the rate at which write operations are performed), and then perform another (e.g., a second) write to the storage media.

Case 6 may include a fault condition that may be mitigated by operating the storage device 110 in a read-only mode, and for which reformatting the storage media 125 may restore full functionality.

Fault Resiliency

Based on one or more fault conditions such as those exemplified by the cases listed in FIG. 2A, in some embodiments, various levels of fault resiliency may be implemented by a storage device 110 in accordance with example embodiments of the disclosure. For example, some embodiments may implement a fully resilient mode, a partially resilient mode, and a vulnerable mode. In the fully resilient mode, the storage device 110 may operate with self-healing features, and the storage device 110 may be capable of recovering full functionality (although the user data in the device may be lost) by resetting operations such as power cycling, re-loading firmware, or formatting of the storage media 125.

In the partially resilient mode, the storage device 110 may operate with lower performance, reduced capacity, or reduced capability, when a fault condition exists. For example, as mentioned above, if a power supply backup capacitor fails, writes may be completed (e.g., command completions may be sent to the host 105) only after data is written to the storage media 125 (i.e., synchronous writes may be performed), slowing the operation of the storage device 110 and reducing its performance. The user data may be preserved in this circumstance. As another example, the storage device 110 may operate with reduced capacity if the reserve space for run time bad block (RTBB) management is exhausted. In this circumstance, the affected dies in the storage device 110 may be excluded from the disk space and the overall disk capacity may be reduced. The user data on the lost space may be lost. For example, if a set in IO determinism or a zone in a zoned namespace is no longer capable of accepting new data writes, the set or the zone may be excluded from disk space but the remaining disk space may remain available for read and write operations. The user data on the zone or set may be lost.

The storage device 110 may operate with reduced capability, for example, if a storage device 110 does not allow write operations, and switches to a read-only mode. In some embodiments, the storage device 110 may be capable of operating in two types of read-only mode: a sustainable read-only mode (which may be referred to as a "first read-only mode"), and an unsustainable read-only mode (which may be referred to as a "second read-only mode"). In the sustainable read-only mode, the storage device 110 may continue to serve read requests beyond the retention period of the storage media 125. The unsustainable read-only mode may be employed, for example, when it may not be feasible to operate in the sustainable read-only mode, e.g., when there is insufficient unused storage space to set up a rescue space. When transitioning to the unsustainable read-only mode, the storage device 110 may send to the host 105 a notification that the storage device 110 is operating in the second (unsustainable) read-only mode, and that data items stored in the storage device 110 may be allowed to expire (e.g., at the end of their respective retention periods). In the unsustainable read-only mode, the storage device 110 may continue to serve read requests during the retention period of the storage media 125, and, if the storage device 110 encounters data integrity issues (as detected, for example, by one or more unsuccessful attempts to decode data during read operations), the storage device 110 may report the invalid data region.

A storage device 110 operating in the vulnerable mode may be incapable of performing normal read and/or write operations, and may perform a graceful exit, for example, by continuing to receive commands from the host and returning errors.

Thus, in some embodiments, a storage device having one or more fault resilient features in accordance with example embodiments of the disclosure may extend and/or organize the features so that a host may utilize them systematically, and the device may continue to operate in some capacity despite a fault condition. In some embodiments, for example, if the storage device is used for a RAID (Redundant Array of Independent (or Inexpensive) Drives) or RAIN (Redundant Array of Independent Nodes), and a node fails, the system may recover the data by copying the data from the accessible space of the storage device without calculating the stripe parity.

Logical Block Address Space Types

In some embodiments, various logical block address (LBA) space types may be implemented by a storage device having fault resiliency features in accordance with example embodiments of the disclosure. These LBA space types may be used, for example, by a storage device such as that illustrated in FIG. 1. Some examples of LBA space types may include performing space (P), underperforming space (UP), read-only space (RO), volatile read-only space (VRO), and inaccessible space (IA), each described below and summarized in the sketch following the list. In some embodiments, an LBA space may also refer to any unit of storage space such as a page, a partition, a set, a zone, and/or the like.

Performing (P) space may include LBA space containing valid data, which may be capable of being read and written in a normal manner without sacrificing performance. Data in performing space may be valid.

Underperforming (UP) space may include LBA space containing valid data, which may be capable of being read and written in a normal manner, but with degraded performance (e.g., degraded write performance).

Read-only (RO) space may include LBA space containing valid data, which may be read-only. For example, a storage device may refuse to write data received from a host and/or may respond with error messages to write commands from the host directed to this type of LBA space. The data in read-only space may remain valid for a period of time exceeding the retention period.

Volatile read-only (VRO) space may include read-only space, and the storage device may respond with error messages to write commands from a host directed to this type of LBA space. Data in this type of LBA space may be temporarily valid, and may become invalid when it expires, i.e., when the age of the data in its storage media reaches the retention period of the storage media.

Inaccessible (IA) space may include LBA space containing invalid data, which may not be accessible from the host.
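
The space types above may be summarized, for illustration only, as the following Python enumeration; the string values are descriptive assumptions:

    from enum import Enum

    class LbaSpaceType(Enum):
        P = "performing"            # valid data, normal read/write
        UP = "underperforming"      # valid data, degraded performance
        RO = "read-only"            # valid beyond the retention period
        VRO = "volatile read-only"  # valid only until the retention period
        IA = "inaccessible"         # invalid data, not host-accessible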

Fault Resilient Modes

In some embodiments, LBA space types may be used, for example, to implement some embodiments of fault resilient modes. FIG. 2B illustrates a table of some example embodiments of fault resilient modes and associated LBA space types that may be implemented by a storage device in accordance with example embodiments of the disclosure. The Mode column of the table illustrated in FIG. 2B may include a fault resilient mode number and a name which may be used to identify the mode, for example, in an application programming interface (API) through which one or more features of a storage device may be accessed in accordance with example embodiments of the disclosure. The columns labeled as P, UP, RO, VRO, and IA in the table illustrated in FIG. 2B may indicate an amount of performing (P), underperforming (UP), read-only (RO), volatile read-only (VRO), and inaccessible (IA) LBA space, respectively, that may be used in the corresponding mode.

In some embodiments, the modes illustrated in FIG. 2B may be invoked, for example, by a host through an API. In some embodiments, a host may query the storage device using a get feature command as described below. In some implementations, modes identified with an asterisk (*) may provide a host with detailed information about each type of LBA space used by the mode in response to a get feature command. In some implementations, information about the LBA space used by the other commands may be implicit. For example, in the power cycle mode (Mode 1), all memory may be of the performing (P) type. In some embodiments, however, other combinations of LBA space types, and/or portions thereof, may be used.

In some embodiments, a storage device may implement any number of the following fault resilient modes. For example, a device manufacturer may implement different combinations of these and other fault resilient modes in different products.

A power cycle mode (Mode 1) may involve self-healing based on power cycling the storage device. For example, a storage device may experience a fault condition based on one or more flipped bits in memory such as SRAM or DRAM. A flipped bit may be caused, for example, by aging, heating, and/or radiation due to an antenna or high elevations above sea level which may interfere with memory cells. A storage device with a fault resilient power cycle mode may have self-healing capabilities such that power cycling the storage device (e.g., removing then reapplying power) may reset the current state and restore the failed SSD to a normal state. In this case, one or more inflight commands in a submission queue may be lost. Whether the user data of the storage device remains valid may depend on implementation details such as the partitioning of the device, the extent to which different circuits of the storage controller are reset, and/or the like. In some embodiments, in a power cycle mode, the entire storage space of the storage device (100 percent) may operate normally (e.g., as performing (P) space).

A reformat mode (Mode 2) may involve self-healing based on formatting all or a portion of the storage device. In some embodiments, formatting the storage device may reset its current state and restore the failed storage device to its normal state. However, depending on the implementation details (e.g., quick format, full format, partitioning details, and/or the like) all data on the disk may be lost. In some embodiments, in a reformat mode, the entire storage space of the storage device (100 percent) may operate normally (e.g., as performing (P) space).

In a reduced capacity read-only mode (Mode 3), a first portion of the storage space (e.g., X percent) of the storage device may operate normally (e.g., as performing (P) space), and a second portion (e.g., (100−X) percent) may operate as read-only (RO) storage space. Thus, the size of the performing (P) space in the storage device may be reduced, and the storage device may behave like a normal drive with respect to that space, but the read-only (RO) type of space may not be writable. In some embodiments, the storage device may provide a list of LBA ranges for the performing (P) and/or read-only (RO) spaces to a host, for example, in response to a get feature command. If the storage device supports IO determinism, the LBA range may represent a set. If the storage device supports Zoned Namespaces (ZNS), the LBA range may represent a zone. In some embodiments, the storage device may also provide information about address ranges for sets and/or ZNS in response to a get feature command.

In a reduced capacity mode (Mode 4), a first portion of the storage space (e.g., X percent) of the storage device may operate normally (e.g., as performing (P) space), and a second portion (e.g., (100−X) percent) may be inaccessible (IA). Thus, the size of the performing (P) space in the storage device may be reduced, and the storage device may behave like a normal drive with respect to that space, but the inaccessible (IA) space may not be available for normal IOs. For example, if the RTBB reserve space is exhausted, the problematic die may be excluded from the disk space, and thus, the overall disk capacity may be reduced. The storage device may provide a list of LBA ranges for the performing (P) and/or inaccessible (IA) type of space. If the storage device supports IO determinism, the LBA range may represent a set. If the storage device supports ZNS, the LBA range may represent a zone. In some embodiments, the storage device may provide information about the LBA ranges, sets, zones, and/or the like, in response to a get feature command.

In a reduced performance mode (Mode 5), one or more aspects of the performance of the storage device may be reduced. For example, the storage device may perform normal operations, but at reduced throughput and/or increased latency. In some embodiments, a storage device may include one or more back-up capacitors that, in the event of a loss of the main power supply, may provide power to the storage device for a long enough period of time to enable the storage device to complete a write operation. If one or more of these back-up capacitors fail, the storage device may not notify a host that a write operation is complete until after the data is written to the media. (This may be referred to as a synchronous write operation.) This may reduce the input and/or output operations per second (IOPS) and/or increase latency, thereby reducing the performance of the storage device. Thus, in some embodiments, a reduced performance mode may operate with 100 percent underperforming (UP) space. Depending on the implementation details, some or all of the user data may remain valid. In some embodiments, the storage device may provide speculative performance information to a host, which may enable the host to make decisions on sending write data to the storage device in a manner that may mitigate the system-level impact of the fault condition.

In a read-only mode (Mode 6), the storage device may only allow read operations and may block external write operations. Depending on the implementation details, data in read-only space may remain valid, for example, after the retention period. Read-only mode may operate with 100 percent read-only (RO) space.

In a partial read-only mode (Mode 7), a first portion of the storage space (e.g., X percent) of the storage device may operate as read-only (RO) space, and a second portion (e.g., (100−X) percent) may be inaccessible (IA) space. Thus, the storage device may only allow read operations, and external write operations may be prohibited in the first portion of the storage space. Depending on the implementation details, data in the read-only space may still be valid, for example, after the retention period. The storage device may provide a list of LBA ranges for the read-only (RO) and/or inaccessible (IA) types of space. If the storage device supports IO determinism, the LBA range may represent a set. If the storage device supports ZNS, the LBA range may represent a zone. In some embodiments, the storage device may provide information about the LBA ranges, sets, zones, and/or the like, in response to a get feature command.

In a temporary read-only mode (Mode 8), data may be read from the storage space of the storage device, which may operate with 100 percent VRO space, but external writes may be prohibited. Data in this space may be temporarily valid but may become invalid after the retention period.

In a temporary partial read-only mode (Mode 9), data may be read from a first portion (e.g., X percent) of the storage space of the storage device, which may operate as VRO space, while external writes may be prohibited. A second portion (e.g., (100−X) percent) may be inaccessible (IA) space. Data in the first portion may be temporarily valid but may become invalid after the retention period. If the storage device supports IO determinism, the LBA range may represent a set. If the storage device supports ZNS, the LBA range may represent a zone. In some embodiments, the storage device may provide information about the LBA ranges, sets, zones, and/or the like, in response to a get feature command.

In a vulnerable mode (Mode 10), the storage device may not be available for I/O operations. However, it may continue to receive commands from the host and return errors.

In a normal mode (Mode 11), the storage device may operate normally.
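
A hedged reconstruction of the mode/space-type relationships described above (and tabulated in FIG. 2B), expressed in Python as percentages of the device's LBA space; the parameter x stands for the X percent portion of the partial modes, and treating the vulnerable mode as entirely inaccessible is an assumption:

    def space_layout(mode: int, x: int = 50) -> dict:
        # Percentage of the LBA space in each space type for Modes 1-11;
        # x is the first portion (X percent) for the partial modes.
        layouts = {
            1: {"P": 100},                 # power cycle
            2: {"P": 100},                 # reformat
            3: {"P": x, "RO": 100 - x},    # reduced capacity read-only
            4: {"P": x, "IA": 100 - x},    # reduced capacity
            5: {"UP": 100},                # reduced performance
            6: {"RO": 100},                # read-only
            7: {"RO": x, "IA": 100 - x},   # partial read-only
            8: {"VRO": 100},               # temporary read-only
            9: {"VRO": x, "IA": 100 - x},  # temporary partial read-only
            10: {"IA": 100},               # vulnerable (assumed)
            11: {"P": 100},                # normal
        }
        return layouts[mode]

    # Example: Mode 3 with X = 70 percent performing space.
    assert space_layout(3, x=70) == {"P": 70, "RO": 30}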

Commands

In some embodiments, a storage device in accordance with example embodiments of the disclosure may implement one or more commands which may be used, for example, by a host to query the storage device and/or manage one or more features of the storage device. FIG. 2C illustrates a table of some example embodiments of commands and subcommands that may be implemented by a storage device in accordance with example embodiments of the disclosure. The subcommand column of the table illustrated in FIG. 2C may indicate a name which may be used to identify the subcommand, for example, in an API through which the commands, and responses thereto, may be passed.

A get feature command, which may include a subcommand as shown in the table illustrated in FIG. 2C, may be passed from a host to a storage device, which may return a response thereto. In some embodiments, a storage device may respond as follows to a get feature command based on the subcommand.

A resiliency type subcommand (FR_INFO_RESILIENCY_TYPE) may return a type of fault resiliency in case of a failure. For example, the storage device may indicate which of the fault resilient modes illustrated in FIG. 2B the device has selected to operate in based on the fault condition it has encountered.

A retention period subcommand (FR_INFO_RETENTION_PERIOD) may return an average retention period of the data without reprogramming the storage media. In some embodiments, this may be the upper-bound of retention time for data in the storage media from the time of the failure. This subcommand may be used, for example, with temporary read-only mode (Mode 8) and/or temporary partial read-only mode (Mode 9).

An earliest expiry subcommand (FR_INFO_EARLIEST_EXPIRY) may return a maximum time remaining for data integrity. In some embodiments, this may be the lower-bound of retention time for data in the storage media from the time of the failure. The unit of time may be determined, for example, based on a patrol period. This subcommand may be used, for example, with temporary read-only mode (Mode 8) and/or temporary partial read-only mode (Mode 9).

An IOPS subcommand (FR_INFO_IOPS) may return a percentage of the maximum available IOPS the storage device may be able to handle based on the fault condition. This subcommand may be used, for example, with reduced performance mode (Mode 5).

A bandwidth subcommand (FR_INFO_BW) may return a percentage of the maximum available bandwidth the storage device may be able to handle based on the fault condition. This subcommand may be used, for example, with reduced performance mode (Mode 5).

A space subcommand (FR_INFO_SPACE) may return an amount of storage space that may be available in the storage device based on the fault condition. This subcommand may be used, for example, with reduced capacity read-only mode (Mode 3) and/or reduced capacity mode (Mode 4).
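
For illustration, a hypothetical host-side helper showing how the get feature subcommands above might be combined; get_feature here is any callable mapping a subcommand name to a value (such as the stub below), and the transport is an assumption, not a defined NVMe interface:

    def summarize_fault_state(get_feature) -> dict:
        # Query the selected fault resilient mode, then fetch the
        # mode-specific details described above.
        info = {"mode": get_feature("FR_INFO_RESILIENCY_TYPE")}
        if info["mode"] in (8, 9):    # temporary (partial) read-only
            info["retention"] = get_feature("FR_INFO_RETENTION_PERIOD")
            info["expiry"] = get_feature("FR_INFO_EARLIEST_EXPIRY")
        elif info["mode"] == 5:       # reduced performance
            info["iops_pct"] = get_feature("FR_INFO_IOPS")
            info["bw_pct"] = get_feature("FR_INFO_BW")
        elif info["mode"] in (3, 4):  # reduced capacity (read-only)
            info["space"] = get_feature("FR_INFO_SPACE")
        return info

    # Stubbed device reporting reduced performance mode (Mode 5):
    stub = {"FR_INFO_RESILIENCY_TYPE": 5, "FR_INFO_IOPS": 60, "FR_INFO_BW": 50}
    print(summarize_fault_state(stub.get))  # {'mode': 5, 'iops_pct': 60, 'bw_pct': 50}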

A namespace capacity management command, which may include a subcommand as shown in the table illustrated in FIG. 2C, may be passed from a host to a storage device, which may respond by performing the action indicated by the subcommand. In some embodiments, a storage device may respond as follows to a namespace capacity management (NCM) command based on the subcommand. In some embodiments, a namespace may be implemented as a quantity of non-volatile memory (NVM) that may be formatted into logical blocks.

A resize subcommand (FR_NAMESPACE_RESIZE) may cause the storage device to resize a namespace based on one or more parameters that may be included with the command. In some embodiments, this subcommand may apply to storage devices that support two or more namespaces. In some embodiments, the namespaces may support NVMe resizing.

A zero-size namespace subcommand (FR_NAMESPACE_ZERO_SIZE) may cause the storage device to reduce the size of a rescue space to zero.
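
A minimal sketch of the namespace capacity management subcommands above, assuming a toy namespace table; the class and field names are hypothetical:

    class NamespaceTable:
        def __init__(self):
            self.sizes = {}  # namespace ID -> capacity in logical blocks

        def resize(self, ns_id: int, new_size: int) -> None:
            # FR_NAMESPACE_RESIZE: resize a namespace per the command
            # parameters (for devices supporting two or more namespaces).
            self.sizes[ns_id] = new_size

        def zero_size(self, rescue_ns_id: int) -> None:
            # FR_NAMESPACE_ZERO_SIZE: reduce the rescue space to zero.
            self.sizes[rescue_ns_id] = 0

    table = NamespaceTable()
    table.resize(1, 1_000_000)  # user namespace
    table.resize(2, 200_000)    # rescue space namespace
    table.zero_size(2)          # reclaim the rescue space
    assert table.sizes == {1: 1_000_000, 2: 0}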

Application Programming Interface

In some embodiments, as mentioned above, a storage device in accordance with example embodiments of the disclosure may implement an API to enable a host to query the storage device and/or manage one or more features of the storage device. FIG. 2D illustrates a table of commands that a storage device in accordance with example embodiments of the disclosure may implement through an API. Some embodiments may include a hierarchy of enumerated constants, within the category of fault resilient features, that the storage device may employ to respond. As illustrated in FIG. 2D, the hierarchy may include a first level, including a fully resilient status, a partially resilient status, and a vulnerable status. Sub-statuses and sub-sub-statuses may also be defined. For example, as illustrated in FIG. 2D, the partially resilient status may include a first sub-status indicating a loss of capability, and the first sub-status may include a first sub-sub-status, indicating operation in the sustainable read-only mode, and a second sub-sub-status, indicating operation in the unsustainable read-only mode. In some embodiments, an API may be implemented, for example, using NVMe commands.

A feature command (FAULT_RESILIENT_FEATURE) may return the fault resilient classes and features in each class that the storage device may support.

A status command (FAULT_RESILIENT_STATUS) may return the status of the storage device after a fault resilient recovery is performed.

A volatile blocks command (FAULT_RESILIENT_VOLATILE_BLOCKS (H)) may return a list of LBA ranges that reach the retention period in the next H hours. In some embodiments, this may be used to determine the blocks that need to be relocated in the unsustainable read-only mode.

An invalid data blocks command (FAULT_RESILIENT_INVALID_DATA_BLOCKS) may return a list of LBA ranges that may become invalid after switching to a fault resilient mode.
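
For illustration only, the status hierarchy of FIG. 2D might be encoded as nested constants; the numeric values are assumptions:

    # First-level statuses
    FULLY_RESILIENT = 0x1
    PARTIALLY_RESILIENT = 0x2
    VULNERABLE = 0x3

    # Sub-status of the partially resilient status
    PR_LOSS_OF_CAPABILITY = (PARTIALLY_RESILIENT, 0x1)

    # Sub-sub-statuses of the loss-of-capability sub-status
    PR_LOC_SUSTAINABLE_READ_ONLY = (*PR_LOSS_OF_CAPABILITY, 0x1)
    PR_LOC_UNSUSTAINABLE_READ_ONLY = (*PR_LOSS_OF_CAPABILITY, 0x2)

    # Example: a device reporting (0x2, 0x1, 0x2) is partially resilient,
    # has lost capability, and is in the unsustainable read-only mode.
    assert PR_LOC_UNSUSTAINABLE_READ_ONLY == (0x2, 0x1, 0x2)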

Additional Embodiments

FIG. 3A illustrates a flow chart of an embodiment of a method for operating in a fault resilient mode in accordance with example embodiments of the disclosure. The method illustrated in FIG. 3A may be implemented, for example, by the systems and/or components illustrated in FIG. 1. The method may begin at operation 300. At operation 305, the host 105 may send data to, or receive data from, the storage device 110. At operation 310, the host 105 may determine whether an error has occurred in the storage device 110. At operation 315, the storage device 110 may perform an internal diagnosis and determine its fault resilient status (e.g., fully resilient, partially resilient, or vulnerable). At operation 320, the storage device 110 may modify its performance, capacity, and/or capability (e.g., transitioning to a read-only mode) based on the diagnosis. At operation 325, the storage device 110 may post the status upon request from the host 105 based on an application programming interface (API). At operation 330, the host 105 may route data of a given type to the storage device 110 or to a different storage device 110 at a given bandwidth based on the status. The method may end at operation 335.

FIG. 3B illustrates a flow chart of an embodiment of a method of operating a storage device in accordance with example embodiments of the disclosure. The method may begin at operation 350. At operation 355, the method may determine a fault condition of the storage device. At operation 360, the method may select a fault resilient mode based on the fault condition of the storage device. At operation 365, the method may operate the storage device in the selected fault resilient mode. The method may end at operation 370.
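
A minimal sketch of the FIG. 3B flow in Python; the stub device and its diagnosis and mode-selection policies are placeholders, not the claimed method:

    def run_fault_resilience(device):
        fault = device.determine_fault_condition()  # operation 355
        mode = device.select_mode(fault)            # operation 360
        device.operate_in_mode(mode)                # operation 365
        return mode

    class StubDevice:
        # Hypothetical device used only to exercise the flow.
        def determine_fault_condition(self):
            return "temperature sensor failure"
        def select_mode(self, fault):
            return 6 if "temperature" in fault else 11  # read-only vs normal
        def operate_in_mode(self, mode):
            print(f"operating in fault resilient mode {mode}")

    assert run_fault_resilience(StubDevice()) == 6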

The operations and/or components described with respect to the embodiment illustrated in FIGS. 3A and 3B, as well as all of the other embodiments described herein, are example operations and/or components. In some embodiments, some operations and/or components may be omitted and/or other operations and/or components may be included. Moreover, in some embodiments, the temporal and/or spatial order of the operations and/or components may be varied. Although some components may be illustrated as individual components, in some embodiments, some components shown separately may be integrated into single components, and/or some components shown as single components may be implemented with multiple components.

Any number of embodiments and/or variations on the embodiments disclosed herein may also be constructed. A storage controller such as a field programmable gate array (FPGA) or embedded processor may perform internal block checks and send asynchronous updates to the host 105 on the status of the storage device 110. Events may occur and be transmitted to the host 105 (e.g., temperature, or other parameters internal to the device). The host 105 may poll the storage devices 110 on a predetermined schedule, for example, if there is no device driver feature for providing notification. A storage controller may monitor the historical performance of the storage device 110 and use machine learning to provide predictive analytics (e.g., a likelihood of the storage device being in a given fault resilient state). Commands (e.g., NVMe commands) may be implemented and/or expanded, for example, to report the state of the storage device 110.

In some embodiments, the host may: (i) send different data types (e.g., file types such as image, video, text, or high-priority or low-priority data) based on the status of the storage device 110 (for instance, high-priority data or real-time data may not be written to a device that is considered to be in the partially vulnerable mode); (ii) reduce the transmission rate if the storage device 110 is in a partially vulnerable state and in a lower performance state; (iii) send a reduced total amount of data if the storage device 110 is in a partially vulnerable and lower capacity state; (iv) read data at the greatest rate possible, and/or store the data elsewhere, if the storage device 110 is in a partially vulnerable unsustainable read-only mode, so as to avoid exceeding the retention period (in such a circumstance, the host may calculate the needed data rate based on the amount of data to be copied and on the retention period); (v) ignore data “read” from a vulnerable storage device 110, since it may be erroneous, and delete the data as it is received by the host 105; and/or (vi) temporarily reroute read/write input and output to a cache in a fully resilient storage device 110 that is being power cycled and/or formatted, based on messages that control the timing of such events between the host and the storage devices 110. A storage controller on a partially vulnerable storage device that has had a capacity decrease may filter incoming data writes and only write a portion of that data to the storage device 110. In some cases, the filtering may include compression. Such a storage controller may receive various types of data (e.g., file types such as image, video, text, or high-priority or low-priority data) from a host 105 and filter based on the status of the storage device 110. For instance, the storage controller may determine that high-priority data may not be written to a storage device 110 that is in the partially vulnerable mode. The storage controller may send a rejection message to the host 105 and give a reason for the rejection. Alternatively, the storage controller may filter out a certain type of data (e.g., image data) for writing to a partially resilient lower-capacity state storage device 110. For example, if a storage device 110 loses performance (e.g., operates at a reduced write rate), latency-sensitive reads and writes may be rejected.
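
As one hypothetical illustration of the host-side policies above (the status strings and the routing function here are invented for this sketch and are not drawn from any standard):

    def route_write(device_status, data_type):
        """Decide how a host might handle a write, given a device's
        reported fault resilient status (illustrative policy only)."""
        if device_status == "vulnerable":
            return "reject"            # data read back may be erroneous
        if device_status == "partially_resilient_read_only":
            return "redirect"          # write the data to another device
        if (device_status == "partially_resilient_reduced_capacity"
                and data_type in ("image", "video")):
            return "filter"            # e.g., compress or drop bulky data
        if (device_status == "partially_resilient_reduced_performance"
                and data_type == "latency_sensitive"):
            return "reject"            # avoid a device with a reduced write rate
        return "write"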

Fault Resilient System with Fault Resilient Storage Devices

In some embodiments, a RAID-0 system including an array of storage devices 110 and a volume manager 115 may be constructed to accommodate a transition of any of the fault resilient storage devices 110 of the RAID-0 system to a read-only mode. In normal operation, the volume manager 115 may be responsible for striping data across the array of storage devices 110, e.g., writing one strip of each stripe to a respective storage device 110 of the array of storage devices 110. In such a system, when any of the array of storage devices 110 transitions to a read-only mode (indicated as 110A), the RAID-0 system may transition to a second operating mode (which may also be referred to as an emergency mode), and the volume manager 115 for the array of storage devices 110 may (i) allocate a rescue space on each of the remaining, unaffected storage devices 110B (e.g., those that remain in a read-write state) for metadata and rescued user data from the faulty storage device 110A, and/or (ii) create and/or maintain a mapping table (which may also be referred to as an emergency mapping table). Rescue space may be pre-allocated statically prior to system operation, dynamically during operation, or in any combination thereof.

The rescue space (which may be indicated as R) on each unaffected storage device 110B may be capable of storing n strips, where n = R/(strip size), R = C/M, C may be the capacity of each of the storage devices of the array of storage devices 110, and M may be the total number of storage devices. In some embodiments, the volume manager 115 may be implemented as an independent component, or may be partially or fully integrated into the host, a RAID controller of the RAID-0 system (which may, for example, be housed in a separate enclosure from the host), or in any other configuration. In some embodiments, the volume manager 115 may be implemented, for example, with an FPGA. The RAID-0 system may be self-contained and may virtualize the array of storage devices 110 so that, from the perspective of the host, the RAID-0 system may appear as a single storage device. In some embodiments, the volume manager may be implemented as a processing circuit (discussed in further detail below) configured (e.g., by suitable software or firmware) to perform the operations described herein as being performed by the volume manager.
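
For example, the sizing relationships n = R/(strip size) and R = C/M may be expressed as follows (a minimal sketch using integer division; the function names are illustrative only):

    def rescue_space_per_device(c_bytes, m_devices):
        """R = C / M: rescue space reserved on each unaffected device."""
        return c_bytes // m_devices

    def rescue_strip_count(r_bytes, strip_size_bytes):
        """n = R / (strip size): R-strips the rescue space can hold."""
        return r_bytes // strip_size_bytes

    # Example: 1 TiB devices, M = 4, 128 KiB strips
    r = rescue_space_per_device(2**40, 4)      # 256 GiB of rescue space
    n = rescue_strip_count(r, 128 * 2**10)     # 2,097,152 R-strips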

When the RAID-0 system is operating in an emergency mode, and a write command is received from the host 105 requesting that data be written to a stripe of the array of storage devices 110, the volume manager 115 may check the emergency mapping table to determine whether the stripe is registered, e.g., whether an entry has already been made for the stripe. If no entry has been made yet (e.g., the stripe is not registered, which may also be referred to as open-mapped), the volume manager 115 may create an entry in the emergency mapping table to indicate where a strip, that ordinarily would have been written to the faulty storage device 110A (the storage device that has transitioned to read-only mode), is to be written. If the emergency mapping table already contains an entry for the stripe, then the entry may be used to determine where to write the strip that ordinarily would have been written to the faulty storage device 110A. In either case, the volume manager 115 may then write each strip, as illustrated in FIG. 4A, to the array of storage devices 110, writing the strip 405 that ordinarily would have been written to the faulty (e.g., read-only) storage device 110A to rescue space in another storage device 110B.

When a read command is received from the host 105 requesting that data of a stripe be read from the array of storage devices 110, the volume manager 115 may check the emergency mapping table to determine whether an entry has been made for the stripe. If no entry has been made, then, as illustrated in FIG. 4B, the volume manager 115 may read the stripe as it would have in ordinary operation, reading a strip from each of the storage devices 110, including the faulty storage device 110A. If the emergency mapping table contains an entry for the stripe, then the entry may be used to determine where to read the strip that ordinarily would have been read from the faulty storage device 110A.

The remapping of strips that ordinarily would have been written to the faulty storage device 110A may be accomplished, for example, as follows. Each storage device 110 of the array of storage devices 110 may have a drive identification number (or “drive ID”), which may be a number between zero and M−1, where M may be the number of storage devices 110 in the array of storage devices 110. The volume manager 115 may reassign the drive identification numbers, e.g., assign to each unaffected storage device 110B of the array of storage devices 110 an alternate drive identification number to be used for performing read or write operations for registered stripes (read operations for unregistered stripes may continue to use the original drive identification numbers). The following formula (Formula A) may be used to generate the alternate drive identification numbers:

If drive ID < faulty drive ID,

new drive ID = ((drive ID − 1) + M) mod (M − 1)

Otherwise,

new drive ID = ((drive ID − 1) + (M − 1)) mod (M − 1).  (Formula A)

The effect of Formula A may be (i) to assign, to each storage device having an identification number less than the original drive identification number of the faulty storage device, the respective original drive identification number, and/or (ii) to assign, to each storage device having an identification number greater than the original drive identification number of the faulty storage device, the respective original drive identification number minus one.

Using the alternate drive numbers, a target drive, to which a strip that ordinarily would have been written to the faulty storage device 110A may be written, may be identified (e.g., on a per stripe basis) using the formula Target Drive ID = sid % (M − 1), where Target Drive ID may be the alternate drive identification number of the target drive, sid may be the stripe identifier of the strip that ordinarily may have been written to the faulty storage device 110A, and “%” may be the modulo (mod) operator.

FIG. 4C is a schematic diagram of an embodiment of a RAID-0 system in accordance with example embodiments of the disclosure. The embodiment illustrated in FIG. 4C may include four fault resilient storage devices 110 (i.e., M=4) in which the storage device identified as Drive 1 has transitioned to a read-only mode. Using Formula A described above, Drive 0 may remain mapped to new Drive ID 0 (e.g., 3 mod 3), Drive 2 may be mapped to new Drive ID 1 (e.g., 4 mod 3), and Drive 3 may be mapped to new Drive ID 2 (e.g., 5 mod 3).

The target drive ID (e.g., for a read or write operation) may be implicitly determined by the equation Target Drive ID = Stripe ID % (M − 1). For example, if M=4 and Stripe 1 is written, Stripe ID=1, and thus, Target Drive ID = 1 % 3 = 1. That is, the target drive may be the storage device 110B with alternate (new) drive identification number 1 (i.e., previous Drive 2). Within the storage device, the rescue space may be split into strips (which may be referred to as rescue strips, or R-Strips), the size of which may be the same as the strip size. In some embodiments, the emergency mapping table may contain an entry for each strip having the format (Stripe ID, R-Strip ID), in which the first element may be the Stripe ID, and the second element may be the R-Strip ID on the target drive. For example, an entry of (1,0) in the emergency mapping table may indicate that Strip (1,1) is mapped to R-Strip (1,0), as shown in FIG. 4C.
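
Formula A, the target drive computation, and the emergency mapping table entry format may be sketched as follows (Python, for illustration only; the numbers reproduce the FIG. 4C example):

    def alternate_drive_id(drive_id, faulty_id, m):
        """Formula A: alternate drive IDs after one device goes read-only."""
        if drive_id < faulty_id:
            return ((drive_id - 1) + m) % (m - 1)        # keeps its ID
        return ((drive_id - 1) + (m - 1)) % (m - 1)      # ID decreases by one

    def target_drive_id(stripe_id, m):
        """Target for the strip that would have gone to the faulty device."""
        return stripe_id % (m - 1)

    # FIG. 4C: M = 4, Drive 1 is faulty
    assert [alternate_drive_id(d, 1, 4) for d in (0, 2, 3)] == [0, 1, 2]
    assert target_drive_id(1, 4) == 1       # Stripe 1 -> new Drive ID 1
    emergency_mapping_table = {1: 0}        # entry (Stripe ID, R-Strip ID) = (1, 0)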

FIG. 5A illustrates a flowchart for a method for operating a RAID-0 system in accordance with example embodiments of the disclosure. At 505, a storage device 110 in a RAID-0 system has a fault and transitions to a read-only mode; at 510, the affected storage device 110 performs an internal diagnosis and determines that its fault resilient status is partially resilient and read-only; at 515, the volume manager 115 determines that the affected storage device 110 is in a read-only mode and reassigns the IDs of the (“live”) unaffected storage devices; at 520, the volume manager 115 receives a write operation, adds an entry to an emergency mapping table to indicate that the strip of the affected device is redirected to a target (unaffected) storage device 110, and the entire strip is written to a rescue space of the target (unaffected) storage device based on the new drive IDs of the unaffected storage devices; and, at 525, the volume manager 115 receives a read command from the host 105, and reads all strips of a stripe from the live unaffected storage devices 110 of the RAID system, while the strip of the affected storage device is read from the rescue space of the target (unaffected) storage device.

FIG. 5B illustrates a flow chart showing details of a method for operating a RAID-0 storage system in accordance with example embodiments of the disclosure. The method includes, at 530, determining that the first storage device is in a read-only state and that the second storage device is in a read-write state; at 535, performing a write operation, of a first stripe, to the storage system, by writing a portion of the first stripe to the second storage device, and making an entry in a mapping table for the first stripe; at 540, performing a first read operation, of a second stripe, from the storage system, by reading a portion of the second stripe from the first storage device and the second storage device; and at 545, performing a second read operation, of the first stripe, from the storage system, by determining that the mapping table includes an entry for the first stripe, and reading a portion of the first stripe from the second storage device.

System with Resilience to N-Device Failures

In some embodiments, a RAID-0 system may be constructed to accommodate the failure of multiple (e.g., N) fault resilient storage devices 110. An example embodiment of such a system may in some ways be constructed and operate in a manner similar to the embodiment described above with respect to FIGS. 4A-4C, but the size of the rescue space R for each storage device may be determined by considering the number N of faulty storage devices 110A the system may accommodate. For example, in some embodiments, a rescue space percentage ratio (e.g., b percent, which may also be referred to as a reservation ratio) may be used to determine the size of the rescue space R for each storage device, where R = (b/100)*C. In some embodiments in which a system has M storage devices and may accommodate N faulty storage devices that may transition to read-only mode, setting b such that N/M <= b/100 may ensure that all data from the N faulty storage devices 110A may be written to rescue space in the remaining unaffected storage devices (which may be referred to as live storage devices). For example, in a system that may have five storage devices and that may accommodate two faulty storage devices that may transition to read-only mode, b may be set to 2/5 = 40 percent. Thus, the size of the rescue space R for each storage device may be set to R = (40/100)*C. The rescue space R on each unaffected storage device 110B may be capable of storing n strips, where n = R/(strip size), but in this embodiment, R may be set to R = (b/100)*C. Rescue space may be pre-allocated statically prior to system operation, dynamically during operation, or in any combination thereof.
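
The reservation ratio arithmetic may be illustrated as follows (a sketch; rescue_space_for_n_failures is a hypothetical helper that picks the smallest b satisfying N/M <= b/100):

    def rescue_space_for_n_failures(c_bytes, m_devices, n_failures):
        """R = (b/100) * C with the smallest b such that N/M <= b/100."""
        b = 100.0 * n_failures / m_devices
        return int((b / 100.0) * c_bytes)

    # Example from the text: M = 5, N = 2 -> b = 40 percent
    assert rescue_space_for_n_failures(1000, 5, 2) == 400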

In a system that may accommodate N fault resilient storage device failures, M′ may represent the number of unaffected (e.g., live) storage devices, such that M′ <= M. In some embodiments, Formula B may be applied once per device failure, with M′ representing the number of storage devices that were live at the time of that failure (including the device that is failing). In some embodiments, the drive IDs of the unaffected storage devices 110B may be reassigned according to the following formula (Formula B):

If current drive ID > current faulty drive ID,

new drive ID = ((current drive ID − 1) + (M′ − 1)) mod (M′ − 1)

Otherwise,

new drive ID = ((current drive ID − 1) + M′) mod (M′ − 1).  (Formula B)

Using the alternate drive numbers, a target storage device for a write operation may be implicitly identified (e.g., on a per stripe basis) using the formula Target Drive ID = sid % (M′ − 1), where Target Drive ID may be the alternate (new) drive identification number of the target storage device, and sid may be the stripe identifier of the strip that ordinarily may have been written to the faulty storage device 110A, and which may now be written to the target storage device having the Target Drive ID.

FIG. 6 illustrates a schematic diagram of an embodiment of a RAID-0 system in accordance with example embodiments of the disclosure. The embodiment illustrated in FIG. 6 may include four fault resilient storage devices 110 (i.e., M=4) in which the storage devices identified as Drive 1 and Drive 2 have transitioned to a read-only mode. Therefore, when the second of these devices transitioned, M′=3 (three devices were live at the time of that transition). Using Formula B described above with M′=3, Drive 0 may remain mapped to new Drive ID 0 (e.g., 2 mod 2), and Drive 3 may be mapped to new Drive ID 1 (e.g., 3 mod 2).

Also, using the formula Target Drive ID = sid % (M′ − 1), if Stripe 1 is written, Stripe ID=1, and thus, Target Drive ID = 1 % 2 = 1. That is, the target drive may be the storage device 110B with alternate (new) drive identification number 1 (i.e., previous Drive 3).
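
Under the reading above, in which Formula B is applied once per failure with M′ equal to the number of devices that were live at the time of that failure, the FIG. 6 example may be reproduced as follows (illustrative Python; the function name is hypothetical):

    def reassign_after_failure(current_ids, faulty_current_id, m_prime):
        """Formula B for one failure. current_ids maps device -> current
        drive ID; m_prime counts the devices live at the time of the
        failure, including the one that is failing."""
        new_ids = {}
        for dev, cur in current_ids.items():
            if cur == faulty_current_id:
                continue                                   # drops out of the live set
            if cur > faulty_current_id:
                new_ids[dev] = ((cur - 1) + (m_prime - 1)) % (m_prime - 1)
            else:
                new_ids[dev] = ((cur - 1) + m_prime) % (m_prime - 1)
        return new_ids

    # FIG. 6: M = 4; Drive 1 fails first, then Drive 2 (current ID 1) fails.
    ids = {0: 0, 1: 1, 2: 2, 3: 3}
    ids = reassign_after_failure(ids, 1, m_prime=4)   # {0: 0, 2: 1, 3: 2}
    ids = reassign_after_failure(ids, 1, m_prime=3)   # {0: 0, 3: 1}
    assert ids == {0: 0, 3: 1}
    assert 1 % (3 - 1) == 1   # Stripe 1 -> new Drive ID 1 (previous Drive 3)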

In some embodiments, when a first faulty storage device 110A transitions to a read-only mode, the RAID-0 system may transition to an emergency mode in which the volume manager 115 may (i) allocate a rescue space on each of the remaining, unaffected storage devices 110B (if adequate rescue space has not been allocated already, or if insufficient space has been allocated) for metadata and rescued user data from a faulty storage device 110A, and/or (ii) create and/or maintain a first mapping table for the first faulty storage device 110A. The RAID-0 system may then operate in a manner similar to the single device failure embodiment described above.

In some embodiments, if a second faulty storage device 110A transitions to a read-only mode, the RAID-0 system may once again allocate a rescue space on each of the remaining, unaffected storage devices 110B (if adequate rescue space has not been allocated already, or if insufficient space has been allocated) for metadata and rescued user data from a faulty storage device 110A. In some embodiments, the RAID-0 system may then create and/or maintain a second mapping table for the second faulty storage device 110A. Each of the mapping tables may be designated as the Lth mapping table, where L = 1 . . . N, and the Lth mapping table corresponds to the Lth faulty storage device. In other embodiments, a RAID-0 system may create and/or modify a single mapping table to map data stripes and/or strips of all of the faulty storage devices 110A to the unaffected storage devices 110B. In some embodiments, one or more mapping tables may be stored in a reserved rescue space, for example, before a Disk Data Format (DDF) structure for a RAID configuration.

The RAID-0 system may then reassign the drive IDs of the unaffected storage devices 110B, for example, based on Formula B, and proceed to operate with the two faulty storage devices 110A operating in read-only mode.

When a read command is received from the host, the volume manager 115 may check the one or more emergency mapping tables to determine whether an entry has been made for the stripe to be read. If no entry has been made, then the volume manager 115 may read the stripe as it would have in ordinary operation, reading a strip from each of the storage devices 110, including the two faulty storage devices 110A. If the one or more emergency mapping tables contain an entry for the stripe, then the entry may be used to determine where to read the strip that ordinarily would have been read from one or both of the faulty storage devices 110A.

When a write command is received from the host, the volume manager 115 may check the one or more emergency mapping tables to determine whether an entry has been made for the stripe. If no entry has been made yet (e.g., the stripe is not registered), the volume manager 115 may create an entry in the one or more emergency mapping tables to indicate where the strips that ordinarily would have been written to the faulty storage devices 110A (the storage devices that have transitioned to read-only mode) are to be written. If the one or more emergency mapping tables already contain an entry for the stripe, then the entry may be used to determine where to write the strips that ordinarily would have been written to the faulty storage devices 110A. In either case, the volume manager 115 may then write the strips to the array of storage devices 110, writing the strips that ordinarily would have been written to the faulty (e.g., read-only) storage devices 110A to rescue space in the other storage devices 110B.

Rescue Space Management with Data Block Write

FIG. 7 illustrates a schematic diagram of an embodiment of a RAID-0 system that may implement rescue space management with data block writing in accordance with example embodiments of the disclosure. The embodiment illustrated in FIG. 7 may include four fault resilient storage devices 110 (i.e., M=4) in which the storage device identified as Drive 1 has transitioned to a read-only mode. The embodiment illustrated in FIG. 7 may in some ways be constructed and operate in a manner similar to the embodiment described above with respect to FIGS. 4A-4C, but rather than redirecting an entire strip from a faulty storage device 110A to the rescue space of a single unaffected storage device 110B, a strip may be split into rescue blocks (which may also be referred to as R-blocks) that may be distributed across the rescue spaces of some or all of the remaining unaffected storage devices 110B.

Within each storage device 110, some or all of the rescue space may be split into rescue blocks (which may be referred to as R-blocks). The size of R-blocks may be set, for example, to the same size as a data block size used generally by the storage device.

In some embodiments, the volume manager 115 may maintain an emergency mapping table in which each entry may simply be a stripe ID to indicate that the stripe has been mapped to the rescue space in the unaffected storage devices 110B. For example, in the embodiment illustrated in FIG. 7, in which the storage device 110A designated as Drive ID 1 has entered read-only mode, an entry in the emergency mapping table of (Stripe ID) = (1) may indicate that Strip (1,1) is split into three chunks and mapped to all unaffected (e.g., live) storage devices 110B.

In some embodiments, the portion of the strip from the faulty storage device that may be stored in the rescue space of each storage device (which may be referred to as a chunk) may include (strip size/block size)/(M−1) blocks. To accommodate a strip size and block size that may not be evenly divided among the number of unaffected storage devices 110B, the chunk stored in the rescue space of a target storage device 110B that satisfies the formula Target Drive ID < (strip size/block size) mod (M−1) may include an extra block. Thus, for the example illustrated in FIG. 7 in which Strip (1,1) may include 10 blocks, the chunk of Strip (1,1) stored in new Drive ID 1 may include 3 blocks, and the chunk stored in new Drive ID 2 may include 3 blocks, but the chunk stored in new Drive ID 0 may include 4 blocks because new Drive ID 0 is less than 10 mod 3 (which equals 1).
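
The chunk arithmetic may be sketched as follows (illustrative; chunk_sizes is a hypothetical helper that reproduces the FIG. 7 example):

    def chunk_sizes(strip_blocks, m):
        """Split a strip of strip_blocks data blocks across the M - 1 live
        devices; target drives with new Drive ID < strip_blocks mod (M - 1)
        receive one extra block."""
        live = m - 1
        base, extra = divmod(strip_blocks, live)
        return [base + 1 if new_id < extra else base for new_id in range(live)]

    # FIG. 7: a 10-block strip, M = 4 -> new Drive IDs 0, 1, 2 hold 4, 3, 3
    assert chunk_sizes(10, 4) == [4, 3, 3]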

In the embodiment illustrated in FIG. 7, the target drive ID (e.g., for a read or write operation) may be implicitly determined, for example, by the equation Target Drive ID = Stripe ID % (M − 1), where M may indicate the number of storage devices 110. For example, if M=4 and Stripe 1 is written, Stripe ID=1, and thus, Target Drive ID = 1 % 3 = 1.

In the embodiment illustrated in FIG. 7, the volume manager 115 may reassign the drive identification numbers for each unaffected storage device 110B using, for example, Formula C as follows:

If drive ID < faulty drive ID,

new drive ID = ((drive ID − 1) + M) mod (M − 1)

Otherwise,

new drive ID = ((drive ID − 1) + (M − 1)) mod (M − 1).  (Formula C)

In some embodiments, the size of the rescue space R in each storage device 110 may be set, for example, to R = C/M, where C may be the capacity of each of the storage devices of the array of storage devices 110, and M may be the total number of storage devices. The rescue space R in each storage device 110 may be capable of storing n blocks, where n = R/(block size).

When a read command is received from the host, the volume manager 115 may check the emergency mapping table to determine whether an entry has been made for the stripe of the strip to be read. If no entry has been made, then the volume manager 115 may read the stripe as it would have in ordinary operation, reading a strip from each of the storage devices 110, including the faulty storage device 110A. If the emergency mapping table contains an entry for the stripe, the chunks of the strip corresponding to the faulty storage device 110A (in this example, Drive 1) may be read from the rescue space of the unaffected storage devices 110B (in this example, the storage devices with new Drive IDs 0, 1, and 2) and reassembled into Strip (1,1).

When a write command is received from the host, the volume manager 115 may check the emergency mapping table to determine whether an entry has been made for the stripe of the strip to be written. If no entry has been made yet (e.g., the stripe is not registered), the volume manager 115 may create an entry in the emergency mapping table to indicate that chunks of the strip that ordinarily would have been written to the faulty storage device 110A (the storage device that has transitioned to read-only mode) are to be written to the unaffected storage devices 110B. If the emergency mapping table already contains an entry for the stripe, then the entry may be used to determine that chunks of the strip that ordinarily would have been written to the faulty storage device 110A are to be written to the unaffected storage devices 110B. In either case, the volume manager 115 may then write the chunks of the strip originally intended for Drive 1 to the rescue spaces of the unaffected storage devices 110B, as illustrated in FIG. 7.

Fault Resilient System with Quality-of-Service Management

In some embodiments, a fault resilient (FR) storage system such as an FR-RAID-0 system may implement one or more quality-of-service (QoS) management features in accordance with example embodiments of the disclosure. For example, a user and/or volume manager may adjust the size of strips in a RAID striping configuration, and/or the writing technique used to write data to a rescue space on one or more storage devices in the RAID configuration, to provide a specific QoS level.

FIG. 8 illustrates an example embodiment of a system for implementing QoS management in a storage system in accordance with example embodiments of the disclosure. The embodiment illustrated in FIG. 8 may include a QoS manager 802 configured to implement one or more QoS features for a storage array 804, for example, through one or more control and/or configuration inputs 806. The storage array 804, which may be arranged, for example, as a RAID-0 array, may include a volume manager 815 and any number of storage devices 810. In some embodiments, the storage array 804 may be implemented at least in part with any of the fault resilient storage devices, systems, and/or methods disclosed herein.

The QoS manager 802 may include QoS logic 808 that may receive, utilize, control, configure, direct, notify, and/or the like, any number of parameters relating to QoS, such as the number of storage devices 811A in the storage array 804, the number of data blocks in a strip 811B, one or more write methods 811C used in a rescue space, the number of faulty storage devices 811D that may be accommodated by the storage array 804, the capacity or capacities 811E of storage devices used in the storage array 804, and/or the like.

For example, in some embodiments, a QoS metric such as performance may be influenced by the parameters 811A-811E in any number of the following manners. Increasing the number of storage devices 811A in the storage array 804 may increase performance, for example, in terms of storage capacity, latency, throughput, and/or the like. The number of data blocks 811B in a strip may be tuned based on the type of anticipated storage transactions. For example, using larger data blocks may provide greater throughput with larger, less frequent transactions, whereas smaller data blocks may provide a greater number of input and/or output operations per second (IOPS) with smaller, more frequent transactions. The write method 811C may also be tuned, for example, because writing data blocks to rescue spaces on multiple storage devices may take less time than writing a strip to the rescue space of a single storage device. Increasing the number of faulty storage devices 811D that may be accommodated by the storage array 804 may reduce performance, for example, because accommodating more faulty devices may involve allocating a greater percentage of storage device capacity to rescue space.

The QoS manager 802 may operate automatically, manually, or in any combination thereof. The QoS manager 802 may operate automatically, for example, in response to monitoring one or more parameters 812 from the storage array 804. The QoS manager 802 may operate manually, for example, in response to one or more parameters 814 received through a user interface 816. Additionally, the QoS manager 802 may provide one or more outputs 818 through the user interface 816 that may instruct a user to take one or more specific actions, for example, to add and/or remove one or more storage devices 810.

In some embodiments, given system requirements from a user, the QoS manager 802 may determine one or more parameters based on storage performance information. For example, a user may specify that the storage array 804 may operate as an FR-RAID-0 configuration that may accommodate one storage device failure with 500K IOPS for 32K blocks and a total storage capacity of 8 TB. Based on these inputs, the QoS manager 802 may determine the following parameters to arrive at a number of storage devices that may be used to provide the specified performance:

Storage device capacity: 1 TB; 4K write IOPS per storage device: 400K; 32K write IOPS per storage device: 200K; and RAID strip size: 32K.

Solving for capacity: (1 − 1/M)*2*(M − 1) >= 8, which gives M^2 − 6M + 1 > 0, or (M − 3)^2 > 8, and thus, M = 6. Solving for performance: 200K*(M − 1)/2 >= 500K, and thus, M = 6. Therefore, six storage devices may be used to provide the specified performance.
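
The arithmetic above may be checked with a short script (the two inequalities are taken from the example, which assumes 1 TB devices; min_devices is a hypothetical helper):

    def min_devices(capacity_tb, iops, per_device_iops):
        """Smallest M satisfying the capacity constraint
        (1 - 1/M)*2*(M - 1) >= capacity_tb and the performance constraint
        per_device_iops*(M - 1)/2 >= iops."""
        for m in range(2, 10_000):
            capacity_ok = (1 - 1 / m) * 2 * (m - 1) >= capacity_tb
            performance_ok = per_device_iops * (m - 1) / 2 >= iops
            if capacity_ok and performance_ok:
                return m
        raise ValueError("no feasible M found")

    assert min_devices(8, 500_000, 200_000) == 6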

In some embodiments, the QoS manager 802 and/or QoS logic 808 may be implemented with hardware, software, or any combination thereof, including combinational logic, sequential logic, one or more timers, counters, registers, state machines, CPLDs, FPGAs, ASICs, CISC processors and/or RISC processors, and/or the like, executing instructions stored in volatile memories such as DRAM and/or SRAM, nonvolatile memory such as flash memory, and/or the like, as well as GPUs, NPUs, and/or the like. The QoS manager 802 and/or QoS logic 808 may be implemented as one or more separate components, integrated with one or more other components such as the volume manager 815, a host, and/or any combination thereof.

FIG. 9 illustrates an embodiment of a method of operating a storage array in accordance with example embodiments of the disclosure. The method may begin at operation 902. At operation 904, the method may determine a first fault resilient operating mode of a first fault resilient storage device of the storage array. At operation 906, the method may determine a second fault resilient operating mode of a second fault resilient storage device of the storage array. At operation 908, the method may allocate one or more rescue spaces of one or more additional fault resilient storage devices of the storage array. At operation 910, the method may map user data from the first fault resilient storage device to the one or more rescue spaces. At operation 912, the method may map user data from the second fault resilient storage device to the one or more rescue spaces. The method may end at operation 914.

FIG. 10 illustrates an embodiment of another method of operating a storage array in accordance with example embodiments of the disclosure. The method may begin at operation 1002. At operation 1004, the method may allocate a first rescue space of a first fault resilient storage device of the storage array. At operation 1006, the method may allocate a second rescue space of a second fault resilient storage device of the storage array. At operation 1008, the method may determine a fault resilient operating mode of a third fault resilient storage device of the storage array. At operation 1010, the method may map user data from the third fault resilient storage device to the first rescue space and the second rescue space based on determining the fault resilient operating mode. The method may end at operation 1012.

FIG. 11 illustrates an embodiment of a further method of operating a storage array in accordance with example embodiments of the disclosure. The method may begin at operation 1102. At operation 1104, the method may determine a first parameter of a first fault resilient storage device of the storage array. At operation 1106, the method may determine a second parameter of a second fault resilient storage device of the storage array. At operation 1108, the method may determine a quality-of-service (QoS) of the storage array based on the first parameter and the second parameter. The method may end at operation 1110.

The operations and/or components described with respect to the embodiments illustrated in FIGS. 9-11, as well as all of the other embodiments described herein, are example operations and/or components. In some embodiments, some operations and/or components may be omitted and/or other operations and/or components may be included. Moreover, in some embodiments, the temporal and/or spatial order of the operations and/or components may be varied.

The embodiments described above have been described in the context of various implementation details, but the principles of this disclosure are not limited to these or any other specific details. For example, some storage arrays have been described in the context of systems in which the capacity and/or size of storage devices and/or rescue spaces may be the same for each storage device, but different capacities and/or sizes of storage devices and/or rescue spaces may be used. As another example, some embodiments have been described in the context of RAID systems such as RAID-0, but the principles may also be applied to any other type of storage array.

As another example, some functionality has been described as being implemented by certain components, but in other embodiments, the functionality may be distributed between different systems and components in different locations and having various user interfaces. Certain embodiments have been described as having specific processes, operations, etc., but these terms also encompass embodiments in which a specific process, step, etc. may be implemented with multiple processes, operations, etc., or in which multiple processes, operations, etc. may be integrated into a single process, step, etc. A reference to a component or element may refer to only a portion of the component or element. For example, a reference to an integrated circuit may refer to all or only a portion of the integrated circuit, and a reference to a block may refer to the entire block or one or more subblocks. The use of terms such as “first” and “second” in this disclosure and the claims may only be for purposes of distinguishing the things they modify and may not indicate any spatial or temporal order unless apparent otherwise from context. In some embodiments, “based on” may refer to “based at least in part on.” In some embodiments, “disabled” may refer to “disabled at least in part.” A reference to a first element may not imply the existence of a second element. Various organizational aids such as section headings and the like may be provided as a convenience, but the subject matter arranged according to these aids and the principles of this disclosure are not defined or limited by these organizational aids.

The various details and embodiments described above may be combined to produce additional embodiments according to the inventive principles of this patent disclosure. Since the inventive principles of this patent disclosure may be modified in arrangement and detail without departing from the inventive concepts, such changes and modifications are considered to fall within the scope of the following claims.

1. A method of operating a storage array, the method comprising: allocating a first rescue space of a first fault resilient storage device of the storage array; allocating a second rescue space of a second fault resilient storage device of the storage array; determining a fault resilient operating mode of a third fault resilient storage device of the storage array; and mapping user data from the third fault resilient storage device to the first rescue space and the second rescue space based on determining the fault resilient operating mode.
2. The method of claim 1, wherein: a first block of the user data is mapped to the first rescue space; and a second block of the user data is mapped to the second rescue space.
3. The method of claim 1, wherein the user data comprises a strip of data.
4. The method of claim 3, wherein: a first portion of the strip of data is mapped to the first rescue space; and the first portion of the strip of data comprises a number of data blocks based on a size of the strip of data and a size of the data blocks.
5. The method of claim 4, wherein the number of data blocks is further based on a total number of storage devices in the storage array.
6. The method of claim 1, further comprising reassigning at least one device identifier (ID) of the first fault resilient storage device to a device ID of the third fault resilient storage device.
7. The method of claim 1, further comprising redirecting one or more inputs and/or outputs (IOs) from the third fault resilient storage device to the first rescue space and the second rescue space.
8. The method of claim 1, wherein the first rescue space has a capacity based on a capacity of the first fault resilient storage device and a total number of storage devices in the storage array.
9. The method of claim 3, wherein the first rescue space has a capacity of strips based on a size of the first rescue space and a block size.
10. A system comprising a storage array comprising: a first fault resilient storage device; a second fault resilient storage device; a third fault resilient storage device; and a volume manager configured to: allocate a first rescue space of the first fault resilient storage device; allocate a second rescue space of the second fault resilient storage device; determine a fault resilient operating mode of the third fault resilient storage device; and map user data from the third fault resilient storage device to the first rescue space and the second rescue space based on determining the fault resilient operating mode.
11. The system of claim 10, wherein the volume manager is further configured to: map a first block of the user data to the first rescue space; and map a second block of the user data to the second rescue space.
12. The system of claim 10, wherein: the user data comprises a strip of data; and the volume manager is further configured to map a first portion of the strip of data to the first rescue space.
13. The system of claim 12, wherein the first portion of the strip of data comprises a number of data blocks based on a size of the strip of data and a size of the data blocks.
14. The system of claim 13, wherein the number of data blocks is further based on a total number of storage devices in the storage array.
15. A method of operating a storage array, the method comprising: determining a first parameter of a first fault resilient storage device of the storage array; determining a second parameter of a second fault resilient storage device of the storage array; and determining a quality-of-service (QoS) of the storage array based on the first parameter and the second parameter.
16. The method of claim 15, further comprising adjusting the first parameter based on the QoS.
17. The method of claim 16, wherein the first parameter is adjusted automatically based on monitoring the first parameter.
18. The method of claim 16, wherein the first parameter is adjusted automatically based on monitoring the second parameter.
19. The method of claim 16, wherein the first parameter is adjusted by configuring a component of the storage array.
20. The method of claim 16, wherein the first parameter is adjusted by controlling the operation of a component of the storage array.
21. The method of claim 15, wherein the first parameter comprises one of a number of storage devices in the storage array, a number of data blocks in a strip of user data for the first fault resilient storage device, a write method for redirecting data from the first fault resilient storage device to the second fault resilient storage device, a number of faulty storage devices supported by the storage array, or a storage capacity of the first fault resilient storage device.
22. A system comprising a storage array comprising: a first fault resilient storage device; a second fault resilient storage device; and a volume manager configured to: determine a first parameter of the first fault resilient storage device; determine a second parameter of the second fault resilient storage device; and determine a quality-of-service (QoS) of the storage array based on the first parameter and the second parameter.
23. The system of claim 22, wherein the volume manager is further configured to adjust the first parameter based on the QoS.
24. The system of claim 23, wherein the volume manager is further configured to adjust the first parameter automatically based on monitoring the first parameter.
25. The system of claim 23, wherein the volume manager is further configured to adjust the first parameter automatically based on monitoring the second parameter.